AITopics | learning equilibria

Learning Equilibria in Matching Markets from Bandit Feedback

Neural Information Processing SystemsDec-23-2025, 20:07:50 GMT

Large-scale, two-sided matching platforms must find market outcomes that align with user preferences while simultaneously learning these preferences from data. But since preferences are inherently uncertain during learning, the classical notion of stability (Gale and Shapley, 1962; Shapley and Shubik, 1971) is unattainable in these settings. To bridge this gap, we develop a framework and algorithms for learning stable market outcomes under uncertainty. Our primary setting is matching with transferable utilities, where the platform both matches agents and sets monetary transfers between them.

learning equilibria, matching market, name change, (7 more...)

Neural Information Processing Systems

Country: Asia > Middle East > Jordan (0.07)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.60)
Information Technology > Data Science > Data Mining > Big Data (0.59)

Add feedback

Learning Equilibria in Adversarial Team Markov Games: A Nonconvex-Hidden-Concave Min-Max Optimization Problem

Neural Information Processing SystemsMay-27-2025, 11:59:29 GMT

We study the problem of learning a Nash equilibrium (NE) in Markov games which is a cornerstone in multi-agent reinforcement learning (MARL). In particular, we focus on infinite-horizon adversarial team Markov games (ATMGs) in which agents that share a common reward function compete against a single opponent, *the adversary*. These games unify two-player zero-sum Markov games and Markov potential games, resulting in a setting that encompasses both collaboration and competition. Kalogiannis et al. (2023) provided an efficient equilibrium computation algorithm for ATMGs which presumes knowledge of the reward and transition functions and has no sample complexity guarantees. We contribute a learning algorithm that utilizes MARL policy gradient methods with iteration and sample complexity that is polynomial in the approximation error \epsilon and the natural parameters of the ATMG, resolving the main caveats of the solution by (Kalogiannis et al., 2023).

adversarial team markov game, learning equilibria, nonconvex-hidden-concave min-max optimization problem, (2 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.60)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.40)

Add feedback

Learning Equilibria in Matching Markets from Bandit Feedback

Neural Information Processing SystemsMay-26-2025, 16:17:27 GMT

Large-scale, two-sided matching platforms must find market outcomes that align with user preferences while simultaneously learning these preferences from data. But since preferences are inherently uncertain during learning, the classical notion of stability (Gale and Shapley, 1962; Shapley and Shubik, 1971) is unattainable in these settings. To bridge this gap, we develop a framework and algorithms for learning stable market outcomes under uncertainty. Our primary setting is matching with transferable utilities, where the platform both matches agents and sets monetary transfers between them. Using this objective, we analyze the complexity of learning as a function of preference structure, casting learning as a stochastic multi-armed bandit problem.

artificial intelligence, data mining, learning equilibria, (7 more...)

Neural Information Processing Systems

Country: Asia > Middle East > Jordan (0.09)

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Artificial Intelligence (1.00)

Add feedback

Learning Equilibria in Matching Markets from Bandit Feedback

Neural Information Processing SystemsOct-9-2024, 16:29:09 GMT

Large-scale, two-sided matching platforms must find market outcomes that align with user preferences while simultaneously learning these preferences from data. But since preferences are inherently uncertain during learning, the classical notion of stability (Gale and Shapley, 1962; Shapley and Shubik, 1971) is unattainable in these settings. To bridge this gap, we develop a framework and algorithms for learning stable market outcomes under uncertainty. Our primary setting is matching with transferable utilities, where the platform both matches agents and sets monetary transfers between them. Using this objective, we analyze the complexity of learning as a function of preference structure, casting learning as a stochastic multi-armed bandit problem.

bandit feedback, learning equilibria, matching market, (4 more...)

Neural Information Processing Systems

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Artificial Intelligence (1.00)

Add feedback

Filters

Collaborating Authors

learning equilibria

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

Learning Equilibria in Matching Markets from Bandit Feedback

Learning Equilibria in Adversarial Team Markov Games: A Nonconvex-Hidden-Concave Min-Max Optimization Problem

Learning Equilibria in Matching Markets from Bandit Feedback

Learning Equilibria in Matching Markets from Bandit Feedback